N=1:

rank | frequency | n-gram |
---|---|---|
1 | 643284 | -а |
2 | 402620 | -и |
3 | 397312 | -е |
4 | 289721 | -о |
5 | 281309 | -т |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 204782 | -те |
2 | 184912 | -та |
3 | 162924 | -от |
4 | 101852 | -то |
5 | 97063 | -на |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 154538 | -ите |
2 | 147605 | -ата |
3 | 75143 | -иот |
4 | 47903 | -ија |
5 | 46630 | -ото |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 46580 | -ните |
2 | 41304 | -ната |
3 | 38165 | -ката |
4 | 36258 | -ниот |
5 | 27999 | -ките |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 18785 | -ијата |
2 | 18249 | -ската |
3 | 17860 | -скиот |
4 | 17034 | -ањето |
5 | 16347 | -ување |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure is the same as in 2.2.5 Most frequent word beginnings, except that the N-grams are taken from the end of each word, so the aim here is suffix detection rather than prefix detection.
For N=3, the MySQL query is:
SELECT @pos := (@pos + 1), xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
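
The same counting can be sketched outside the database for any N. The following Python snippet is only an illustration under stated assumptions: the function name `top_endings`, its parameters, and the toy word list are hypothetical and not part of the original setup. It takes a list of distinct word forms (one entry per row of the word list, as `COUNT(*)` does in the query) and does not reproduce the `w_id > 100` filter. Unlike MySQL, whose grouping depends on the column collation, the comparison here is case-sensitive.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Rank the most frequent word-final letter n-grams.

    words: iterable of distinct word forms (one entry per word-list row).
    Returns a list of (rank, frequency, "-ngram") tuples.
    """
    # word[-n:] yields the whole word when it is shorter than n,
    # which matches the behaviour of MySQL's RIGHT(word, n).
    counts = Counter(word[-n:] for word in words)
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(counts.most_common(limit), start=1)
    ]


# Hypothetical toy word list, not taken from the corpus.
sample = ["книгата", "жената", "масата", "градот", "светот", "децата", "куќите"]
for n in range(1, 6):
    print(n, top_endings(sample, n, limit=2))
```

Calling `top_endings` once per N from 1 to 5 yields one ranked list per table, in the same rank, frequency, n-gram column order.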

See also: 2.2.5 Most frequent word beginnings